Abstract

Vegetables are critical for human health as they are a source of multiple vitamins, including vitamin E (VTE). In plants,
the synthesis of VTE compounds, tocopherol and tocotrienol, derives from precursors of the shikimate and
methylerythritol phosphate pathways. Quantitative trait loci (QTL) for α-tocopherol content in ripe fruit have
previously been determined in a Solanum pennellii tomato introgression line population. In this work, variations of
tocopherol isoforms (α, β, γ, and δ) in ripe fruits of these lines were studied. In parallel, all tomato genes structurally
associated with VTE biosynthesis were identified and mapped. Previously identified VTE QTL on chromosomes
6 and 9 were confirmed whilst novel ones were identified on chromosomes 7 and 8. Integrated analysis at the
metabolic, genetic and genomic levels allowed us to propose 16 candidate loci putatively affecting tocopherol
content in tomato. A comparative analysis revealed polymorphisms at nucleotide and amino acid levels between
Solanum lycopersicum and S. pennellii candidate alleles. Moreover, evolutionary analyses showed the presence of
codons evolving under both neutral and positive selection, which may explain the phenotypic differences between
species. These data represent an important step in understanding the genetic determinants of VTE natural variation
in tomato fruit and as such in the ability to improve the content of this important nutraceutical.

Key words: Fruit metabolism, Solanum pennellii, tocopherol, tomato, vitamin E. 



Introduction 

Vegetables are critical for human health as they are a source of multiple vitamins and other
essential compounds. In particular, tomato fruits are an important dietary source of antioxidants
for humans, owing both to their high intrinsic content of these compounds and to the elevated
consumption of this crop by the western population. The main non-enzymatic antioxidants found
in tomato fruits are ascorbic acid (VTC), lycopene and other carotenoids, phenolics, and vitamin E
(VTE) (Abushita et al., 1997; Frusciante et al., 2007). Recent studies have reinforced the
hypothesis of beneficial effects of VTE on human health, mainly in the prevention of coronary heart
disease, breast cancer, and protection against nicotine-induced oxidative stress in the brain
(Das et al., 2009; Ros, 2009; Zhang et al., 2009). Although its function in plants remains somewhat
undefined, several reports link VTE to the protection of pigments, proteins, and polyunsaturated
fatty acids of the photosynthetic apparatus against reactive oxygen species (ROS) generated during
photosynthesis (Semchuk et al., 2009). It has additionally been proposed that VTE interacts with
other antioxidant mechanisms in order to maintain cellular redox homeostasis (Foyer and
Noctor, 2005).

Abbreviations: IL, introgression line; LRT, likelihood ratio test; MEP, methylerythritol phosphate; QTL, quantitative trait loci; ROS, reactive oxygen species; SK,
shikimate; VTC, vitamin C; VTE, vitamin E.

The synthesis of VTE occurs in photosynthetic organisms and its major constituents are a group
of amphipathic molecules containing a polar chromanol head group derived from homogentisate
and a polyprenyl lipophilic side chain, products of the shikimate (SK) and methylerythritol
phosphate (MEP) pathways, respectively. VTE compounds, collectively termed tocochromanols, can
be classified into two groups on the basis of the degree of saturation of their hydrophobic tails.
Tocopherols, which are the most abundant in plants, have saturated tails derived from phytyl
2P, whereas tocotrienols have an unsaturated tail derived from geranylgeranyl 2P. The VTE
biosynthesis pathway proceeding from the conversion of hydroxyphenylpyruvate to homogentisate
is considered the 'VTE core pathway' and comprises seven enzymes: 4-hydroxyphenylpyruvate
dioxygenase (HPPD, EC 1.13.11.27), homogentisic acid geranylgeranyl transferase (HGGT/HST,
EC 2.5.1.-), homogentisate phytyl transferase (VTE2, EC 2.5.1.-), dimethyl-phytylquinol methyl
transferase (VTE3, EC 2.1.1.-), tocopherol cyclase (VTE1, EC 5.3.-.-), γ-tocopherol C-methyl
transferase (VTE4, EC 2.1.1.95), and phytol kinase (VTE5, EC 2.7.-.-). There are four naturally
occurring forms of tocopherols and tocotrienols (α, β, γ, and δ), which differ in the position and
number of methyl groups on the chromanol ring (Munné-Bosch and Alegre, 2002). Although all VTE
isoforms are potent antioxidants in vitro, α-tocopherol is the most active in terms of vitamin
activity, partly because it is retained in the human body in preference to other tocopherols and
tocotrienols (Traber and Sies, 1996). In plants, tocochromanols have been found exclusively in
plastids. Since it has not been proved thus far that any isoform can be transported within the
plant, and the enzymes of the core pathway have been found in plastids (Sun et al., 2009), it is
assumed that the biosynthesis also occurs in this compartment.

Although the VTE biosynthetic pathway was elucidated in 1979 (Soll and Schultz, 1979), the
identification of the genes involved is much more recent. In the last decade, via the use of
genetic and genomics-based methods, the genes encoding the enzymes for most of the steps of the
VTE core biosynthesis pathway have been identified and cloned. However, as yet, this is confined
to the model organisms Arabidopsis thaliana and Synechocystis sp. PCC6803 (Li et al., 2008).
Indeed, the characterization of VTE mutants and transgenic lines has provided considerable insight
into the regulatory network of tocochromanol biosynthesis (for review see Mène-Saffrané and
DellaPenna, 2009; Falk and Munné-Bosch, 2010). These combined studies have additionally suggested
roles for VTE compounds beyond their antioxidant function, including participation in diverse
physiological processes such as germination, photoassimilate partitioning, growth, leaf
senescence, and plant responses to abiotic stress (Falk and Munné-Bosch, 2010).



Moreover, several studies have demonstrated a close interaction between VTE and other metabolic
pathways. Tomato fruits overexpressing phytoene synthase (PSY), a key enzyme in carotenoid
biosynthesis, displayed increased levels of tocopherol (Fraser et al., 2007). Moreover,
tocochromanol content is additionally affected when the post-chorismate pathway is manipulated.
Arabidopsis transgenic plants overexpressing the bacterial bi-functional chorismate mutase
(CM)/prephenate dehydratase (PDT) displayed significantly higher levels of phenylalanine, as well
as γ-tocopherol and γ-tocotrienol, besides other secondary metabolites (Tzin et al., 2009). Plant
tocochromanol biosynthesis is furthermore subjected to control by both environmental and
endogenous signals. In agreement with this statement, the silencing of the light response factor
DE-ETIOLATED 1 resulted in tomato fruits with enhanced levels of antioxidants, including
carotenoids, flavonoids, and tocopherol (Davuluri et al., 2005; Enfissi et al., 2010).

Cultivated tomato (Solanum lycopersicum) is the most consumed vegetable globally. The fact that
its wild relatives display tremendous variation in metabolite content in both leaves and fruits
(Schauer et al., 2005) renders wild germplasm an important source for metabolic gene discovery
focused on aiding efforts to improve the nutritional and industrial quality of crop species
(Zamir, 2001; Fernie et al., 2006; Tohge and Fernie, 2010). Utilizing this approach, Schauer et al.
(2006) reported a detailed metabolite profile of 76 tomato introgression lines (ILs) containing
chromosome segments of the wild species Solanum pennellii in the genetic background of the
cultivated S. lycopersicum (cv M82; Eshed and Zamir, 1995). Following the quantification of 74
metabolites of known chemical structure, they were able to identify 889 quantitative fruit
metabolic loci for variations in the content of amino and organic acids, sugars, alcohols, fatty
acids, VTC, and VTE. Two of these quantitative trait loci (QTL), explaining variation in the
α-tocopherol fruit content, were located on chromosomes 6 and 9. Independent experiments available
at the Tomato Functional Genomics Database (Fei et al., 2006; http://ted.bti.cornell.edu/) have
also revealed differences in tocopherol content associated with the exact same genomic regions.
However, the mechanisms explaining these variations are currently not understood, partially due
to a lack of knowledge of the complete VTE biosynthetic pathway in tomato.

The aim of the current report is to provide a framework for associating gene sequence with fruit
tocopherol content phenotypes by (i) characterizing and mapping all genes involved in the VTE
biosynthesis pathway in tomato, (ii) identifying QTL for the content of the vitamers of VTE
and their candidate genes, (iii) cloning and sequencing these genes from S. pennellii, and (iv)
examining evolutionary patterns of candidate genes by comparing orthologues from S. pennellii,
S. lycopersicum, and A. thaliana. The combined results of this study will be discussed in the
context of the fundamental understanding of the accumulation of VTE, opening the way for
further functional analyses.
Materials and methods

Plant material 

Tomato seeds from S. lycopersicum L. (cv M82) and S. pennellii introgressed lines were obtained
from the Tomato Genetic Resource Center (http://tgrc.ucdavis.edu). Tomato plants were grown in
20 l pots under greenhouse conditions: 16/8 h photoperiod, 24±3 °C, 60% humidity, and
140±40 µmol m⁻² s⁻¹ incident photo-irradiance. Cloning was carried out from fully expanded source
leaves and mature fruits (60 d after flowering). For tocopherol quantification, six ripe fruits
were taken from six individual plants of ILs 6-1, 6-2, 7-4, 7-4-1, 7-5, 8-2, 8-2-1, 9-1, 9-2-6.
Tissue was collected, immediately frozen in liquid nitrogen, and stored at -80 °C until use.

Survey of tocopherol biosynthesis enzymes, genome mapping, 
and expression analyses 

The VTE pathway presented in Fig. 1 was outlined by combining data reported for enzymatic steps
involved in SK, MEP, and VTE biosynthesis available on KEGG (Kyoto Encyclopedia of Genes
and Genomes, http://www.genome.jp/kegg/) and related scientific literature. Arabidopsis loci were
obtained from the KEGG database and these sequences were used to perform TBLASTN
(Altschul et al., 1990) searches against tomato expressed sequences from the Lycopersicon
Combined Build # 3 unigene database housed by the Solanaceae Genomics Network
(http://solgenomics.net). The criteria used to determine orthology were ≥40% identity
at the amino acid level and ≥65% coverage of the Arabidopsis protein. For uncompleted unigenes,
the coverage cut-off was set at ≥30%. Based on BLASTN results, all unigene sequences were used
as queries to identify the corresponding genomic sequences from the S. lycopersicum genome
assembly version 2.31 (3,230 sequences; 781,381,961 total letters) of the International Tomato
Genome Sequencing Project at the Solanaceae Genomics Network (http://solgenomics.net). In silico
prediction of the subcellular localization of tomato unigenes was performed using the TargetP
(http://www.cbs.dtu.dk/services/TargetP) and ChloroP (http://www.cbs.dtu.dk/services/ChloroP)
programs based on their deduced amino acid sequences.
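
The orthology cut-offs described above can be illustrated with a minimal sketch. It assumes that each TBLASTN hit has already been reduced to an identity percentage and a count of Arabidopsis residues covered by the alignment; the `BlastHit` record, the function names, and the unigene identifiers are hypothetical and serve only to show how the ≥40% identity and ≥65% (or ≥30% for uncompleted unigenes) coverage thresholds combine.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    """One TBLASTN hit of an Arabidopsis protein against a tomato unigene."""
    unigene: str            # tomato unigene identifier (hypothetical in this demo)
    identity_pct: float     # amino acid identity over the aligned region (%)
    aligned_residues: int   # Arabidopsis residues covered by the alignment
    protein_length: int     # full length of the Arabidopsis protein (aa)
    complete: bool          # True if the unigene spans the full coding sequence


def coverage_pct(hit: BlastHit) -> float:
    """Percentage of the Arabidopsis protein covered by the alignment."""
    return 100.0 * hit.aligned_residues / hit.protein_length


def passes_orthology_criteria(hit: BlastHit,
                              min_identity: float = 40.0,
                              min_cov: float = 65.0,
                              min_cov_incomplete: float = 30.0) -> bool:
    """Apply the identity cut-off and the coverage cut-off that matches
    the completeness of the unigene."""
    required_cov = min_cov if hit.complete else min_cov_incomplete
    return hit.identity_pct >= min_identity and coverage_pct(hit) >= required_cov


# Hypothetical hits against a 477-residue Arabidopsis protein.
hits = [
    BlastHit("unigene_A", 72.0, 400, 477, True),   # passes both cut-offs
    BlastHit("unigene_B", 38.0, 450, 477, True),   # identity below 40%
    BlastHit("unigene_C", 55.0, 160, 477, False),  # incomplete, coverage above 30%
    BlastHit("unigene_D", 55.0, 160, 477, True),   # complete, coverage below 65%
]

selected = [h.unigene for h in hits if passes_orthology_criteria(h)]
print(selected)
```

Keeping the two coverage thresholds as separate parameters makes it explicit that the relaxed cut-off applies only to unigenes flagged as incomplete.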

The identified tomato genes were mapped onto the Tomato-EXPEN 2000 genetic map available at the
Solanaceae Genomics Network. The genetic positions were obtained by BLASTN
(Altschul et al., 1990) searches of the identified unigenes and/or their corresponding genomic
sequences against the entire Tomato-EXPEN 2000 map marker sequences database
(http://solgenomics.net/index.pl). MapChart 2.2 software (Voorrips, 2000) was used to construct
the graphical representation of the genetic map.

Expression data for the set of genes analysed here were extracted from a previously published
TOM1 microarray experiment (Carrari et al., 2006). This experiment comprises transcript analyses
from tomato fruits harvested throughout development and ripening
(10, 15, 20, 21, 35, 49, 56, and 70 d after anthesis).

Tocopherol quantification by HPLC and QTL mapping 

Tocopherol extraction was performed as described by Fraser et al. (2000) with the following
modifications: tomato fruit were ground to a fine powder in liquid nitrogen and 500 mg of
material was extracted with 1.5 ml of methanol and, after vortex-mixing, 1 ml of chloroform was
added. Following 5 min of sonication, 1 ml of Tris buffer (50 mM Tris pH 7.5/1 M NaCl)
was added. The chloroform phase was recovered and the methanol phase (remaining pellet) was
re-extracted with chloroform (2 ml). Chloroform extracts were pooled and adjusted to
a final volume of 4 ml. Two millilitres were dried under nitrogen gas and re-suspended in 0.2 ml
of 99.5:0.5 hexane/isopropanol. The tocopherol content was determined using a Hewlett-Packard
series 1100 HPLC system coupled with a fluorescence detector (Agilent Technologies series 1200).
Separation was carried out on a normal-phase column Metasil Si (250 mm × 4.6 mm, 5 µm,
Varian; Metachem, Torrance, CA, USA) maintained at room temperature using an isocratic solvent
system (mobile phase) consisting of 99.5:0.5 hexane/isopropanol with a flow rate of
1 ml min⁻¹. Eluting compounds were detected and quantified by fluorescence with excitation at
296 nm and emission at 340 nm. Identification and quantification of tocopherol compounds were
achieved by comparison with the retention times and peak areas of standards purchased from Merck
(tocopherol set; Calbiochem #613424). A daily calibration curve was carried out using
a tocopherol solution with a concentration range between 0.31 µg ml⁻¹ and 5 µg ml⁻¹ for each
isoform. Data of tocopherol isoforms or total tocopherol content were statistically analysed
according to Sokal and Rohlf (1981). When the data set presented homoscedasticity, with or
without data transformation using ln or square root, an ANOVA followed by a Dunnett test
(P<0.05) was used to compare tocopherol content between ILs and the M82 control. Where
homoscedasticity could not be achieved, a non-parametric comparison was performed by the
Kruskal-Wallis test (P<0.05 and P<0.1). Statistical tests were performed with
Bioestat 5.0 (Ayres et al., 2007) and InfoStat v. 2009 (Grupo
InfoStat, FCA, Universidad Nacional de Córdoba, Argentina).
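
The quantification step can be sketched as a linear calibration followed by a back-calculation to fresh weight. The following is an illustration under stated assumptions, not the authors' procedure: the intermediate standard concentrations and all peak areas are hypothetical, a straight-line detector response and complete extraction recovery are assumed, and only the volumes (4 ml pooled extract, 2 ml dried, 0.2 ml re-suspension) and the 500 mg sample mass come from the protocol above.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def content_ug_per_g(conc_ug_ml, resuspension_ml=0.2, extract_total_ml=4.0,
                     extract_dried_ml=2.0, tissue_g=0.5):
    """Convert the concentration in the injected sample (µg/ml) into
    µg tocopherol per g fresh weight, assuming complete recovery."""
    ug_in_dried_aliquot = conc_ug_ml * resuspension_ml
    ug_in_whole_extract = ug_in_dried_aliquot * extract_total_ml / extract_dried_ml
    return ug_in_whole_extract / tissue_g


# Hypothetical daily calibration for one isoform (0.31 to 5 µg/ml).
std_conc = [0.31, 0.625, 1.25, 2.5, 5.0]       # µg/ml
std_area = [33.0, 64.5, 127.0, 252.0, 502.0]   # arbitrary fluorescence units

slope, intercept = fit_line(std_conc, std_area)

sample_area = 152.0                             # hypothetical sample peak area
sample_conc = (sample_area - intercept) / slope
sample_content = content_ug_per_g(sample_conc)
print(round(sample_conc, 3), round(sample_content, 3))
```

With the protocol volumes, the conversion factor from injected concentration to fresh weight content is 0.2 × (4/2) / 0.5 = 0.8, so each µg ml⁻¹ in the injected sample corresponds to 0.8 µg g⁻¹ of fruit.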

The position of tocopherol QTL on the Tomato-EXPEN 2000 map was determined according to the
flanking markers of the S. pennellii introgression fragments in the analysed ILs (Eshed and
Zamir, 1995) and the mapping of unigenes was performed as described in
Kamenetzky et al. (2010).

Identification of VTE-related pathway candidate genes 

Candidate genes were surveyed along the genomic regions spanned by the identified VTE QTL as
described in Bermudez et al. (2008). All molecular markers mapped onto the selected genomic
regions were identified in the comparison merging the Tomato-EXPEN 2000, the Tomato-EXPEN 1992,
and the Tomato IL maps by using the comparative map web interface of the Solanaceae
Genomics Network (Mueller et al., 2008). All marker sequences were used as queries to identify
the corresponding unigenes in the Solanaceae Genomics Network database. Gene product functions
were determined according to homology to a previously characterized protein, whose function had
been experimentally demonstrated in other related plant species, by using the BLASTX
algorithm (Altschul et al., 1990) against the NCBI non-redundant (nr) protein database
(http://blast.ncbi.nlm.nih.gov/Blast.cgi). The cut-off criteria were: ≥40% identity at amino acid
level and ≥65% coverage of the orthologous Arabidopsis protein. For uncompleted unigenes the
coverage cut-off was set at ≥30%.

Amplification, cloning, and sequencing 

Total RNA from 100 mg of source leaf or fruit tissue was isolated using Trizol reagent
(Invitrogen, #15596-026) and 1 µg from each sample was treated with amplification grade DNase I
(Invitrogen, #18068-015) to remove potential contamination of genomic DNA.
The treated RNA was reverse-transcribed to first-strand cDNA and primed with oligo(dT) using
Superscript First-Strand (Invitrogen, #18080-044) following the manufacturer's protocol.

Primers were designed using the software Oligo Analyzer 3.1 (http://www.idtdna.com) on the basis
of unigene sequences including the sequence surrounding the putative ATG start and stop
codons. The primer sequences are available in Table S1 in
Supplementary data available at JXB online.

Table 1. Identification of tomato genes involved in VTE biosynthesis and QTL-associated candidate genes. A. thaliana loci used for evolutionary analyses are underlined.

Enzyme a | A. thaliana locus (no. amino acids) | Curated localization b | Tomato unigene c | Signal peptide d | Genomic id e | Linked marker f | Chromosome position g (cM)
1-Deoxy-D-xylulose-5-P synthase (DXS, EC 2.2.1.7) | At4g15560 (717) | Chloroplast | U567647 (1); [illegible] | Chloroplast; nd | SL2.31sc05941; [illegible] | T1704; TG523 | 1 (39 cM); 11 ([illegible] cM)
2-C-methyl-D-erythritol 4-phosphate synthase (DXR, EC 1.1.1.267) | At5g62790 (477) | Chloroplast | U585813 | Chloroplast | SL2.31sc03701 | C2_At5g23060 | 3 (102.5 cM)
2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase (CMS, EC 2.7.7.60) | At2g02500 (302) | Chloroplast | U566797 | Chloroplast | SL2.31sc04323 | TG528 | 1 (127.5 cM)
4-(Cytidine 5'-diphospho)-2-C-methyl-D-erythritol kinase (ISPE, EC 2.7.1.148) | At2g26930 (383) | Chloroplast | U583224 | Chloroplast | SL2.31sc04133 | CTOC-4-C7 | 1 (19 cM)
2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase (ISPF, EC 4.6.1.12) | At1g63970 (231) | Chloroplast | U568497 | Chloroplast | SL2.31sc03923 | C2_At3g27530 | 8 (78 cM)
4-Hydroxy-3-methylbut-2-enyl-diphosphate synthase (HDS, EC 1.17.7.1) | At5g60600 (717) | Chloroplast | U567167 | Chloroplast | SL2.31sc03876 | C2_At5g60600 | 11 (79 cM)
4-Hydroxy-3-methylbut-2-enyl-diphosphate reductase (HDR, EC 1.17.1.2) | At4g34350 (466) | Chloroplast | U580658 | Chloroplast | SL2.31sc04323 | C2_At4g34350 | 1 (154 cM)
Isopentenyl diphosphate δ-isomerase (IPI, EC 5.3.3.2) | At3g02780 (284); At5g16440 (291) | Mitochondria; Chloroplast (Phillips et al., 2008) | U577516 (1); U569721 (2) | Chloroplast; Chloroplast | SL2.31sc06101; SL2.31sc03902 | U49812; C2_At5g04270 | 4 (64 cM); 5 (112.7 cM)
Geranyl pyrophosphate synthase (GPPS, EC 2.5.1.1) | At2g34630 (422) | Chloroplast (Bouvier et al., 2000) | U573523 | Mitochondria | SL2.31sc03835 | C2_At1g30360 | 8 (21.3 cM)
Geranylgeranyl pyrophosphate synthase (GGPS, EC 2.5.1.29) | At4g36810 (371); At4g38460 (326) | Chloroplast; Chloroplast | U574849 (1); U571085 (2); U573348 (3); [illegible] (4) | Chloroplast; Chloroplast; nd; Chloroplast | SL2.31sc03748; SL2.31sc04135; SL2.31sc03665; [illegible] | C2_At5g16710; C2_At1g19340; CT232; T0532 | 11 (31.4 cM); 4 (112 cM); 2 (90.1 cM); 9 ([illegible] cM)
Geranylgeranyl reductase (GGDR, EC 1.3.1.-) | [illegible] | Chloroplast | [illegible] | Chloroplast | [illegible] | [illegible] | [illegible]
3-Deoxy-D-arabino-heptulosonate-7-P synthase (DAHPS, EC 2.5.1.54) | At1g22410 (527); At4g33510 (432) | Chloroplast; Chloroplast; Chloroplast (Entus et al., 2002) | U581552 (1); [illegible] | Chloroplast; nd | SL2.31sc03748; [illegible] | T0408; T1560 | 11 (26 cM); [illegible]
3-Dehydroquinate synthase (DHQS, EC 4.2.3.4) | At5g66120 (442) | Chloroplast | U568781 | Chloroplast | SL2.31sc03665 | C2_At3g01160 | 2 (83.4 cM)
Shikimate dehydrogenase (SDH, EC 1.1.1.25) / 3-Dehydroquinate dehydratase (DHQ, EC 4.2.1.10) | At3g06350 (603) | Chloroplast | U570855 (1); U570070 (2) | nd; nd | SL2.31sc05941; SL2.31sc03622 | T1704; TG221 | 1 (39 cM); 6 (101 cM)
Shikimate kinase (SK, EC 2.7.1.71) | At2g21940 (276); At4g39540 (300) | nd; Chloroplast | U582040 | Chloroplast | SL2.31sc06101 | C2_At3g62940 | 4 (56 cM)
5-Enolpyruvylshikimate-3-P synthase (EPSPS, EC 2.5.1.19) | At1g48860 (521); At2g45300 (520) | Chloroplast; Chloroplast | U577580 | Chloroplast | SL2.31sc04323 | C2_At2g45240 | 1 (57 cM)
Chorismate synthase (CS, EC 4.2.3.5) | At1g48850 (380) | Chloroplast | U563165 (1); U563163 (2) | Chloroplast; Chloroplast | SL2.31sc06101; SL2.31sc03604 | C2_At3g07950; TG370 | 4 (56.7 cM); 4 (21.5 cM)
Chorismate mutase (CM, EC 5.4.99.5) | At1g69370 (316); At5g10870 (340); At3g29200 (265) | Chloroplast (Mobley et al., 1999); Chloroplast (Eberhard et al., 1996); Cytosol (Eberhard et al., 1996) | U575627 (1); U585231 (2) | nd; nd | SL2.31sc03665; SL2.31sc03748 | T1480; TG147 | 2 (106 cM); 11 (45 cM)
Prephenate aminotransferase (PAT, EC 2.6.1.57) | At2g22250 (475) | Chloroplast | U567172 | Chloroplast | SL2.31sc06101 | C2_At4g39830 | 4 (61 cM)
Arogenate dehydrogenase (TyrA, EC 1.3.1.78) | At1g15710 (358); At5g34930 (640) | Chloroplast; Chloroplast (Rippert et al., 2009) | U567861 (1); U570951 (2) | Chloroplast; Chloroplast | SL2.31sc03731; SL2.31sc03771 | C2_At5g34850; T1212 | 7 (0.4 cM); 9 (48 cM)
Tyrosine aminotransferase (TAT, EC 2.6.1.5) | At5g53970 (414) | nd | U577103 (1); U563404 (2) | nd; nd | SL2.31sc05925; SL2.31sc03685 | C2_At1g53000; C2_At1g03820 | 10 (7.5 cM); 7 (43 cM)
4-Hydroxyphenylpyruvate dioxygenase (HPPD, EC 1.13.11.27) | At1g06570 (473) | Cytosol (Garcia et al., 1999) | U580457 (1); U578997 (2) | nd; Chloroplast | SL2.31sc03685; SL2.31sc03902 | TG584; CLET-6-I4 | 7 (36.5 cM); 5 (70 cM)
Homogentisate geranylgeranyl transferase/homogentisate solanesyl transferase (HGGT/HST, EC 2.5.1.-) | At3g11945.2 (393) | Chloroplast | U585005 | Chloroplast | SL2.31sc06725 | C2_At3g58490 | 3 (72.6 cM)
Homogentisate phytyl transferase [HPT (VTE2), EC 2.5.1.-] | At2g18950 (393) | Chloroplast | U327540 (5'); U576207 (3') | Chloroplast | SL2.31sc03731 | CTOA-13-K15 | 7 (17 cM)
Dimethyl-phytylquinol methyl transferase [MPBQMT (VTE3), EC 2.1.1.-] | At3g63410 (338) | Chloroplast | U578249 (1); U581492 (2) | Chloroplast; Chloroplast | SL2.31sc04777; SL2.31sc04439 | T0565; TG324 | 9 (52 cM); 3 (4.6 cM)
Tocopherol cyclase [TC (VTE1), EC 5.3.-.-] | At4g32770 (488) | Chloroplast | U570602 | Chloroplast | SL2.31sc04948 | C2_At4g32770 | 8 (34 cM)
γ-Tocopherol C-methyl transferase [γ-TMT (VTE4), EC 2.1.1.95] | At1g64970 (348) | Chloroplast (Ferro et al., 2010) | U584511 | Chloroplast | SL2.31sc03923 | TG282 | 8 (41.8 cM)
Phytol kinase [PK (VTE5), EC 2.7.-.-] | At5g04490 (304) | Chloroplast | U583081 | Chloroplast | SL2.31sc03771 | C2_At5g58240 | 9 (50.5 cM)
Anthranilate phosphoribosyltransferase (APT, EC 2.4.2.18) | At5g17990 (444) | Chloroplast | U566340 | Chloroplast | SL2.31sc05054 | C2_At5g17990 | 6 (59 cM)
Phosphoribosylanthranilate isomerase (PRAI, EC 5.3.1.24) | At1g07780 (275); At1g29410 (244); At5g05590 (275) | Chloroplast; Chloroplast (Zhao and Last, 1995); Chloroplast | U564371 | Chloroplast | SL2.31sc05732 | cLET-1-113 | 6 (24 cM)

Full-length cDNA fragments were generated by PCR using Taq Platinum Pfx DNA polymerase
(Invitrogen, #11708-013). The PCR reactions were conducted in a total volume of 50 µl
containing 0.3 mM each dNTP, 0.4 µM each primer, 1.5× reaction buffer, 1 mM MgSO4, ~150 ng of
cDNA, and 2.5 IU of enzyme. The amplification conditions were: 94 °C for 3 min; 35 cycles of
94 °C for 15 s, primer-specific annealing temperature for 30 s, and then 68 °C for 2 min.
Amplification products were purified with GFX purification Kit (Amersham Biosciences,
#289034-70) and cloned into a pCR-Blunt II TOPO vector using a TOPO-Zero Blunt cloning kit
(Invitrogen, #45-0245). Plasmid DNA was isolated using a Qiagen Miniprep Kit (#27106) and
inserts were sequenced with BigDye Terminator (Applied Biosystems, #4336919) on an ABI3700
automated sequencer (Applied Biosystems). Sequence data from this article have been deposited
in the GenBank Data Libraries under accession numbers HQ014366-HQ014383 and
HQ219713-HQ219716. Polymorphisms were detected at nucleotide and amino acid levels by
aligning S. pennellii and S. lycopersicum sequenced alleles (excluding primer regions) using
the MULTALIN program (http://www-archbac.u-psud.fr/genomics/multalin.html; Corpet, 1988).
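
The distinction between nucleotide-level and amino-acid-level polymorphisms can be illustrated with a short sketch. It assumes two coding sequences that are already aligned in frame, have equal length, and contain no gaps or ambiguous bases; the function name and the toy sequences are hypothetical and are not taken from the sequenced alleles. It is not a substitute for MULTALIN, which performs the alignment itself.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_polymorphisms(cds_a: str, cds_b: str):
    """Return (nucleotide differences, amino acid differences) between two
    in-frame, gap-free, equal-length coding sequences."""
    cds_a, cds_b = cds_a.upper(), cds_b.upper()
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and of equal length")
    nt_diffs = sum(a != b for a, b in zip(cds_a, cds_b))
    aa_diffs = 0
    for i in range(0, len(cds_a), 3):
        if CODON_TABLE[cds_a[i:i + 3]] != CODON_TABLE[cds_b[i:i + 3]]:
            aa_diffs += 1
    return nt_diffs, aa_diffs


# Toy alleles: one synonymous change (GCT -> GCC) and one
# non-synonymous change (GAA -> GAT, Glu -> Asp).
allele_1 = "ATGGCTCTTGAA"
allele_2 = "ATGGCCCTTGAT"
nt_diffs, aa_diffs = count_polymorphisms(allele_1, allele_2)
print(nt_diffs, aa_diffs)
```

In the toy example only one of the two nucleotide changes alters the encoded protein, which is the distinction that the synonymous and non-synonymous distances in the evolutionary analyses build on.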

Evolutionary analyses
The coding region alignments of A. thaliana, S. lycopersicum, and S. pennellii were performed
with BioEdit Sequence Alignment Editor (Hall, 1999) using the ClustalW package (v1.81; Thompson
et al., 1994) and were manually curated according to amino acid alignment. Non-synonymous (dN)
and synonymous (dS) distances and their SE values were estimated with MEGA 4.1 software
(Tamura et al., 2007) using the Nei-Gojobori method (p-distance). In order to preserve the
reading frames, the alignment gaps were deleted prior to estimation of dS and dN. Codon bias was
determined by the effective number of codons (Nc) value computed in the CodonW program
(mobyle.pasteur.fr/cgi-bin/portal.py?form=codonw). Nc varies between 21 for maximum codon bias,
when only one codon is used per amino acid, and 61 for minimum codon bias, when synonymous
codons for each amino acid are used at similar frequencies. One-way ANOVA with Tukey's post-hoc
test in the InfoStat software was performed to evaluate significant differences in codon usage.
In order to compare codon evolution models to determine selective constraint, three models were
fitted using the CODEML program of the PAML suite (Yang, 2007). The first model, M0, assumes
that all codons across the sequences have the same level of dN and dS and estimates these
values and the dN/dS ratio (ω). ω is a signal of the selection at protein level; thus, 0<ω<1
indicates purifying selection, ω=1 neutral selection, and ω>1 points to the presence of positive
selection. The model M1a proposes the existence of two classes of codon, a proportion with
0<ω<1 and the remainder of codons with ω=1. Finally, model M2a divides codons into three
classes: those with 0<ω<1, ω=1, and ω>1. The fit of model M0 versus M1a or M1a versus M2a is
evaluated by a likelihood ratio test comparing twice the difference in log likelihoods with
a χ² distribution (Yang, 2007).
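
The likelihood ratio test can be sketched in a few lines. The log-likelihood values below are hypothetical, and the degrees of freedom (1 for M0 versus M1a, 2 for M1a versus M2a) are an assumption based on the usual difference in free parameters between these models; they are not stated in the text. The χ² tail probability is written in closed form for these two cases only, so that the example needs nothing beyond the standard library.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square distribution for df = 1 or 2."""
    if x <= 0.0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("closed form implemented only for df = 1 or df = 2")


def likelihood_ratio_test(lnl_null: float, lnl_alt: float, df: int):
    """Return (2 * delta lnL, P value) for nested codon models."""
    statistic = 2.0 * (lnl_alt - lnl_null)
    return statistic, chi2_sf(statistic, df)


# Hypothetical log likelihoods for one gene.
lnl_m0, lnl_m1a, lnl_m2a = -2510.0, -2500.0, -2495.0

stat_m0_m1a, p_m0_m1a = likelihood_ratio_test(lnl_m0, lnl_m1a, df=1)
stat_m1a_m2a, p_m1a_m2a = likelihood_ratio_test(lnl_m1a, lnl_m2a, df=2)
print(stat_m0_m1a, p_m0_m1a)
print(stat_m1a_m2a, p_m1a_m2a)
```

A significant M1a versus M2a comparison is what would support the presence of a class of codons with ω>1 for the gene under test.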
Identification, mapping, and expression analyses of
tomato genes involved in VTE biosynthesis

[...] of the convergence of the plastidial MEP and SK pathways. With the aim of identifying
every metabolic step in tocochromanol biosynthesis in tomato, the two routes from their
primary metabolism precursors were linked to the VTE core pathway (Fig. 1) based on data
available in the KEGG database. After an in-depth search of the unigene database deposited in
the Solanaceae Genomics Network (http://solgenomics.net), using Arabidopsis loci as reference
sequences, all the putative tomato enzyme-encoding genes were identified (Table 1). Twenty-nine
biochemical reactions constitute the VTE biosynthesis pathway and these are catalyzed by 28
enzymes, for which a total of 41 different tomato encoding loci were described. For eleven of
the enzymes, the number of surveyed tomato genes differs from those described for
Arabidopsis, according to the criteria adopted here. Only one point in the pathway shown in
Fig. 1 remains obscure in plant metabolism, which is the phosphorylation of phytyl P to
provide phytyl 2P from phytol via VTE5, as an alternative to
the MEP pathway (Valentin et al., 2006).

Fig. 1. VTE biosynthesis pathway. The MEP, SK, and tocopherol core pathways are highlighted in red, green, and blue, respectively.
Candidate genes of the VTE-related pathways (carotenoid, chlorophyll, tryptophan and folate metabolism, and the SEC14 protein)
are not highlighted. Enzymes are named according to their abbreviations in Table 1. Genes for which wild alleles were cloned,
sequenced, and analysed are underlined.

The subcellular localization of the protein products of the previously identified
S. lycopersicum loci was predicted by the TargetP and ChloroP programs. Enzymes that did not
contain a predicted targeting peptide were considered cytosolic. This in silico prediction was
in general agreement with the experimental evidence reported for the Arabidopsis
orthologues (Table 1). For two proteins, however, predictions revealed unexpected results. The
tomato geranyl pyrophosphate synthase (GPPS) enzyme was predicted to be targeted to the
mitochondria, while for the bi-functional shikimate dehydrogenase/3-dehydroquinate dehydratase
(SDH/DHQ) no signal peptides were detected for any of the identified loci. Moreover, none of
the tomato CMs presented a predicted plastid signal peptide. The fact that the last enzyme of
the post-chorismate portion of the SK pathway, tyrosine aminotransferase (TAT), and HPPD
appear to be localized in the cytosol in Arabidopsis cells raises questions regarding the
transport of homogentisate across the chloroplast envelope (Joyard et al., 2009). These
predictions, with regard to the tomato proteins, also failed to propose a subcellular
localization for TAT, even though one of the HPPD unigenes exhibited a chloroplast signal
peptide prediction.

As a second step in the characterization of the genetic basis of tocochromanol biosynthesis,
the 41 tomato loci involved in MEP, SK, and VTE core pathways were localized onto the tomato
genetic map (Tomato-EXPEN 2000) (Figs 1, 2; Table 1). Mapping was based on physical
linkage between mapped markers and the unigenes and/or their corresponding genomic sequences.
With the exception of chromosome 12, all other tomato chromosomes harbour at least one of the
identified loci. Interestingly, most of the tocochromanol core pathway enzyme-encoding loci
were localized on chromosomes 7, 8, and 9, except for hst and vte3(2), mapping to chromosome 3,
and hppd(2), mapping to chromosome 5. Remarkably, the four genes mapped on chromosome 9,
vte3(1), vte5, arogenate dehydrogenase [tyra(2)], and geranylgeranyl pyrophosphate synthase
[ggps(4)], all co-localize with a previously reported QTL for α-tocopherol content
described by Schauer et al. (2006).

Fig. 2. Genomic localization of tocopherol biosynthesis and candidate genes. All genes were localized in the Tomato-EXPEN 2000
genetic map available at the Solanaceae Genomics Network (http://solgenomics.net/index.pl). Markers and genes are indicated on the
left side of the chromosomes. Tocopherol QTL are indicated on the right side of the chromosomes. Gene colour code is in accordance
with Fig. 1.

The expression patterns of 27 of the 41 identified genes
could be evaluated across fruit development and
ripening by retrieving data from a previously published
microarray experiment (Carrari et al., 2006). This analysis
revealed that all 27 genes were expressed in tomato fruits at
at least one time point. The recently developed *omeSOM
model of neural clustering (Milone et al., 2010) revealed
six groups. Whilst these clusters grouped genes from all
three evaluated pathways, no pathway-specific patterns were
identified (data not shown).

QTL for fruit tocopherol content and identification of 
candidate genes 

As mentioned above, these mapping results revealed that the
majority of the VTE core pathway enzyme-encoding genes
are grouped within chromosomes 7, 8, and 9. Previously,
using GC-MS analysis, Schauer et al. (2006) reported two
QTL for fruit α-tocopherol content localized to chromosomes
6 and 9. In order to investigate further the presence of
other QTL associated with the genomic regions that harbour
the tocopherol core biosynthesis genes, and to obtain
a precise and detailed quantification, an HPLC protocol for
measuring all four tocopherol isoforms was applied.

The content of α-, β-, γ-, δ-, and total tocopherol was
determined in ripe tomato fruits from the ILs 6-1, 6-2, 7-4,
7-4-1, 7-5, 8-2, 8-2-1, 9-1, and 9-2-6 as well as in fruits from the
corresponding S. lycopersicum control (cv M82). The
amount of each isoform and their ratios are presented in
Fig. 3, whereas the identified QTL are positioned on the
genetic map presented in Fig. 2. On chromosome 9, two
QTL were identified for α- and total tocopherol (ILs 9-1
and 9-2-6), one for β-tocopherol (IL 9-1), and one for
γ-tocopherol (IL 9-1). Both QTL for total tocopherol are in
agreement with the results reported in Schauer et al. (2006).
The QTL on IL 9-2-6 co-localize with two VTE core
pathway encoding genes, vte3(1) and vte5 (Fig. 2), whilst
the QTL on IL 9-1 spans the genomic region containing
tyra(2) and ggps(4). Measurements performed in fruits
from ILs with S. pennellii introgressions on chromosome 6
showed significant differences from M82 fruits in the levels
of β-tocopherol (IL 6-2) and of δ- and total tocopherol (IL 6-1).
Moreover, QTL were also identified on chromosome 7 (ILs
7-4 and 7-4-1 for α- and β-tocopherol, respectively) and 8
(IL 8-2-1 for total tocopherol), also co-localizing with the
genes encoding the VTE core pathway enzymes: vte2 and
hppd(1) on chromosome 7 and vte1 and vte4 on chromosome
8 (Fig. 2). Besides the mentioned genes, the presence
of tyra(1) at 0.4 cM and tat(2) at 43 cM on chromosome 7
could also be responsible for the α- and β-tocopherol QTL
mapped onto these regions.
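
As a rough illustration of how a single IL could be compared with the M82 control, the sketch below computes a two-group Kruskal-Wallis statistic in plain Python. The replicate values are hypothetical placeholders, not data from this study, and the function name is an assumption. Ties receive mid-ranks without the tie correction factor, and the chi-square approximation (1 degree of freedom) is only indicative with six replicates per group.

```python
import math


def kruskal_wallis_two_groups(control, line):
    """Return (H, p) for a two-group Kruskal-Wallis test.

    Ties get mid-ranks; the tie correction factor is omitted (assumption).
    The p-value uses the chi-square approximation with 1 degree of freedom.
    """
    pooled = [(value, 0) for value in control] + [(value, 1) for value in line]
    pooled.sort(key=lambda item: item[0])
    n = len(pooled)

    # Assign mid-ranks to runs of equal values.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = mid_rank
        i = j + 1

    # Sum the ranks within each group.
    rank_sums = [0.0, 0.0]
    for (_, group), rank in zip(pooled, ranks):
        rank_sums[group] += rank

    sizes = [len(control), len(line)]
    h = 12 / (n * (n + 1)) * sum(
        rs ** 2 / size for rs, size in zip(rank_sums, sizes)
    ) - 3 * (n + 1)
    h = max(h, 0.0)  # guard against tiny negative rounding errors
    p = math.erfc(math.sqrt(h / 2))  # chi-square survival function, df = 1
    return h, p


# Hypothetical tocopherol contents (arbitrary units), six replicates each.
m82_control = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
introgression_line = [16.0, 17.0, 18.0, 19.0, 20.0, 21.0]

h_stat, p_value = kruskal_wallis_two_groups(m82_control, introgression_line)
print(f"H = {h_stat:.3f}, p = {p_value:.4f}")
```

A line would be flagged as carrying a putative QTL when the p-value falls below the chosen threshold (0.05 or 0.1 in Fig. 3).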

By surveying the genomic regions spanning the identified
QTL, six novel candidate genes belonging to VTE-related
pathways were found. On chromosome 6, at 32 cM,
a chlorophyllase-encoding gene (CHL, EC 3.1.1.14) was
identified. It has been proposed that the first step in the
degradation of chlorophyll during senescence and fruit
ripening is catalysed by this enzyme (Hörtensteiner,
2006) through the release of phytol, providing phytyl
2P to the VTE pathway via VTE5. Moreover, three
enzyme-encoding genes for the two branching pathways
from chorismate to tryptophan and folate were also
identified: a phosphoribosylanthranilate isomerase (PRAI,
EC 5.3.1.24), a folylpolyglutamate synthase (FPGS, EC
6.3.2.17), and an anthranilate phosphoribosyltransferase
(APT, EC 2.4.2.18) mapping at 24, 26, and 59 cM,
respectively. Finally, the lycopene β-cyclase (LYCB)-encoding
gene was found at 74 cM on chromosome 6. This
enzyme, which catalyses the last step in β-carotene
biosynthesis, has been thoroughly characterized following
positional cloning by Ronen et al. (2000) and constitutes
a strong candidate for tocopherol content due to the
common precursor geranylgeranyl 2P. Another putative
candidate, the SEC14 protein-encoding gene, was found on
chromosome 9. Several studies have demonstrated the
involvement of this protein in tocopherol transport in
mammalian cells and lipid traffic in plants (Saito et al.,
2007; Bankaitis et al., 2009). With the exception of chl,
these novel candidates are expressed in tomato fruits at at
least one time point of the developmental analysis performed
by Carrari et al. (2006).

Taken together, the results obtained from tomato gene
identification, mapping, tocopherol quantification, and
QTL localization allow 16 candidate loci putatively affecting
tocopherol content in tomato to be proposed: prai, fpgs,
chl, apt, and lycb on chromosome 6; tyra(1), vte2, hppd(1),
and tat(2) on chromosome 7; vte1 and vte4 on
chromosome 8; and ggps(4), tyra(2), vte5, sec14, and
vte3(1) on chromosome 9 (Figs 1-3).

Allele characterization of QTL-associated candidates

The identification of 16 QTL-associated candidate genes
prompted us to examine the allelic differences between
S. lycopersicum and S. pennellii (underlined genes in Fig. 1).
The coding regions of the wild alleles were thus
cloned from the corresponding ILs using primers annealing
to the initial and stop codons. Although only minor size
differences were observed between the two alleles for most
of the genes, the TyrA(2)-, VTE4-, PRAI-, and CHL-encoding
genes exhibited different amplicon lengths. These results
indicated that neither vast allelic polymorphisms nor large
genomic rearrangements span the chromosomal regions
encompassing the analysed genes. This comparison revealed
that all analysed candidate genes present at least one
non-synonymous polymorphism (Table 2). The most divergent
alleles are those encoding the PRAI enzyme, for which the
S. pennellii allele encodes a protein 26 amino acids shorter
than that encoded by the S. lycopersicum allele.

Fig. 3. Tocopherol content. Tocopherol content was determined
by HPLC. Grey bars indicate means of six biological replicates.
Significant differences compared with the M82 control cultivar
(black bars) according to Dunnett test (P<0.05) and/or Kruskal-Wallis
test (P<0.05 ** and P<0.1 *) are indicated.
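
The counts reported in Table 2 can be illustrated with a minimal sketch that compares two aligned coding sequences codon by codon. The two short sequences below are hypothetical toy alleles, not the cloned tomato sequences. The function assumes gap-free, in-frame, upper-case sequences of equal length (so the insertions and deletions listed in Table 2 would need separate handling) and uses the standard genetic code.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def count_polymorphisms(cds_a, cds_b):
    """Return (polymorphic nucleotides, polymorphic amino acids).

    Both inputs must be aligned, gap-free coding sequences of equal
    length, with a length that is a multiple of three.
    """
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("sequences must be in frame and of equal length")

    # Nucleotide differences, position by position.
    nucleotide_diffs = sum(1 for x, y in zip(cds_a, cds_b) if x != y)

    # Amino acid differences, codon by codon (non-synonymous changes only).
    amino_acid_diffs = 0
    for start in range(0, len(cds_a), 3):
        codon_a = cds_a[start:start + 3]
        codon_b = cds_b[start:start + 3]
        if CODON_TABLE[codon_a] != CODON_TABLE[codon_b]:
            amino_acid_diffs += 1
    return nucleotide_diffs, amino_acid_diffs


# Hypothetical toy alleles: one synonymous (GCT/GCC) and one
# non-synonymous (GAA/GAT) difference.
lyc_allele = "ATGGCTGAAAAATAA"
pen_allele = "ATGGCCGATAAATAA"
print(count_polymorphisms(lyc_allele, pen_allele))
```

Separating the two counts in this way shows why a gene can carry several polymorphic nucleotides but only one polymorphic amino acid.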

Table 2. Comparison of S. lycopersicum and S. pennellii alleles of cDNA encoding VTE candidate genes. LYC, S. lycopersicum; PEN,
S. pennellii.

Gene      Unigene                     Coding sequence (nt)   Predicted protein      No. of polymorphic   No. of polymorphic
                                                             (no. amino acids)      nucleotides (a)      amino acids (a)
                                      LYC        PEN         LYC        PEN

ggps(4)   U575882                     1005       1005        334        334         11                   3
tat(2)    U563404                     1269       1269        422        422         10                   6
tyra(1)   U567861                     1134       1134        377        377         7                    1
tyra(2)   U570951                     1173       1182        390        393         24+9 insertions      9+3 insertions
vte4      U584511                     1089       1086        362        361         9+3 deletions        2+1 deletion
vte1      U570602                     1497       1497        498        498         12                   5
vte3(1)   U578249                     1020       1020        339        339         7                    1
vte2      U327540 (covers 5' end)     1209       1209        402        402         8                    3
          U576207 (covers 3' end)
vte5      U583081                     882        882         293        293         10                   4
hppd(1)   U580457                     1263       1263 (b)    420        420 (b)     16                   7
apt       U566340 (c)                 1097 (c)   1097 (c)    365        365         8                    4
prai      U564371                     906        828         301        275         12+78 deletions      3+26 deletions
fpgs      U581922 (c)                 1457 (c)   1457 (c)    485        485         13                   4
chl       U574853                     939        948         312        315         22+9 insertions      9+3 insertions
sec14     U583419                     1275       1275        424        424         11                   4
lycb      U570109                     1497       1497        498        498         19                   9

(a) Nucleotide or amino acid insertions or deletions in S. pennellii sequences.
(b) S. pennellii does not present a stop codon along the analysed region.
(c) Probably lacking 3' end.

VTE biosynthesis genes: evolutionary analyses of
cultivated and wild tomato alleles

The fate of cellular metabolic networks generally depends
on the products of many loci. The inter-relationships
between loci at the phenotypic level raise the question of
whether they evolved independently. In this work, QTL for
tocopherol content were mapped using S. pennellii ILs,
16 candidate loci linked to those QTL were identified, and
the wild species alleles were cloned. Furthermore, in order to
study how the structure of the VTE metabolic pathway
could have influenced protein evolution rates, the
evolutionary pattern among the candidate genes and their
paralogues was also investigated by estimating the pairwise
synonymous (dS), non-synonymous (dN), and dN/dS
divergence between S. lycopersicum and S. pennellii (Fig. 4).
The dS and dN values varied greatly between genes, by
4.8-fold and 16-fold for dS and dN, respectively. Four
genes, ggps(2), tyra(2), chl, and lycb, displayed particularly
high values of dN, above the mean. The dN/dS also
varied remarkably among the 22 loci analysed, with the
highest value being 17.4 times the lowest. The genes of the
MEP, post-chorismate SK, and tocopherol biosynthesis
core pathways displayed values similar to or lower than
the average, with the exception of those presenting more
than one locus. Among candidate genes of related pathways,
apt, chl, and lycb displayed higher dN/dS values than
the mean. Interestingly, the five genes for which more than
one locus was identified displayed variations of dN/dS
values between paralogues. The dN/dS values for the
different ggps, tat, tyra, vte3, and hppd loci varied by 3.5, 2.1,
7.6, 3.4, and 1.9 times, respectively. Differences in dN/dS
values between the paralogues might be due to high dN
caused by weak selection at non-synonymous sites, or
related to the intensity of natural selection on synonymous
codon usage. To investigate the existence of codon usage
bias, the effective number of codons (Nc) was calculated
for each gene and species (S. pennellii and S. lycopersicum).
No statistically significant differences (ANOVA, P>0.01)
in Nc were observed between ggps, tat, tyra, and vte3
paralogues, suggesting that the dN/dS differences are most
probably due to a constraint relaxation for one of the gene
copies rather than codon bias. In contrast, for the hppd
pair a significant Nc bias was observed (ANOVA, P<0.01).
This can also be visualized in Fig. 4 by comparing dN and dS
values, suggesting that the dN/dS rate differences between
hppd paralogues might not be explained by constraint
relaxation.

Fig. 4. Evolutionary rates at synonymous (dS) and non-synonymous (dN) sites of candidate genes linked to tocopherol QTL. Genes
belonging to MEP, SK, and tocopherol core pathways are highlighted in red, green, and blue, respectively. VTE-related pathway
candidates are not highlighted. Lines indicate estimated means.

Although dN/dS is a useful indicator of selective pressure,
its application involves some oversimplification, since it
may be that only certain codons in a gene can change in
a way that enhances fitness whereas all others cannot accept
substitutions without cost to fitness. Thus, to better explore
patterns of sequence variation, the orthologous sequences
from A. thaliana, S. lycopersicum, and S. pennellii for all 22
genes under study were aligned and a Likelihood Ratio Test
(LRT) with three models of molecular evolution was
applied. The first, M0, assumes that all positions across the
sequences have the same level of dN and dS. The second,
M1a, proposes that a proportion of codons is under
purifying selection while the remainder evolve neutrally.
Finally, M2a divides codons into three classes: those
under purifying selection, those evolving neutrally,
and the remainder under positive selection. In order
to avoid data misinterpretation, Nc was estimated and
the comparison between species did not show statistically
significant differences (P>0.01), indicating that there is no
codon usage bias. Comparison of the three models showed
that for 19 genes M1a displayed the best fit, indicating that,
even though a proportion of the codons of every gene are
evolving neutrally, there is no support for positive selection.
Nevertheless, for ggps(2), ggps(4), and vte1, M2a presented
a better fit than M1a, with different levels of
significance, showing that these genes exhibited signs of
positive selection. Interestingly, while ggps(2) and vte1
showed a higher proportion of codons evolving under
positive selection (~25%) than ggps(4) (0.8%), they displayed
lower significance levels. This can be explained by
the high ω value of ggps(4), indicating that there is no
variation in the synonymous sites along the codons under
positive natural selection (Table 3).
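
The model comparisons described above rest on likelihood ratio tests. As a hedged sketch, the snippet below computes the test statistic 2(lnL_alt - lnL_null) from the log-likelihoods listed for ggps(4) in Table 3. The degrees of freedom are assumptions (1 for M1a versus M0 and 2 for M2a versus M1a, a common convention for these nested models), since the text does not state them. The resulting p-values are therefore illustrative and, given the rounded log-likelihoods, need not reproduce the significance marks in Table 3. The function names are also illustrative.

```python
import math


def chi2_survival(statistic, df):
    """Upper-tail probability of a chi-square variable (df = 1 or 2 only)."""
    if statistic <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(statistic / 2))
    if df == 2:
        return math.exp(-statistic / 2)
    raise ValueError("this sketch only supports df = 1 or df = 2")


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (2 * delta lnL, p-value) for two nested models."""
    statistic = 2 * (lnl_alt - lnl_null)
    return statistic, chi2_survival(statistic, df)


# Log-likelihoods for ggps(4) as listed in Table 3.
lnl_m0, lnl_m1a, lnl_m2a = -2233.7, -2200.8, -2196.7

# M1a versus M0: does allowing a neutral codon class improve the fit?
stat_m1a, p_m1a = likelihood_ratio_test(lnl_m0, lnl_m1a, df=1)

# M2a versus M1a: does adding a positively selected class improve the fit?
# df = 2 is an assumption, not a value taken from the text.
stat_m2a, p_m2a = likelihood_ratio_test(lnl_m1a, lnl_m2a, df=2)

print(f"M1a vs M0:  2*dlnL = {stat_m1a:.1f}, p = {p_m1a:.3g}")
print(f"M2a vs M1a: 2*dlnL = {stat_m2a:.1f}, p = {p_m2a:.3g}")
```

For genes where M1a and M2a have identical log-likelihoods in Table 3, the statistic is zero and the extra class of codons is not supported.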



Discussion 

The two pathways feeding the main precursors of VTE
biosynthesis, MEP and SK, as well as the tocopherol core
route, were surveyed across the tomato genome with the
aim of identifying regulatory steps of VTE biosynthesis
in fruit. The metabolic reconstruction of VTE metabolism
resulted in the identification of 29 reactions catalysed
by the protein products of 28 genes, for which 41 different
S. lycopersicum loci were identified (Fig. 1, Table 1). In the
course of reconstructing these routes, certain assumptions
concerning the presence or absence of reactions involved in
VTE metabolism had to be made. First, in the MEP
pathway, GPPS is the enzyme leading to monoterpene
biosynthesis, while GGPS is responsible for the production
of geranylgeranyl 2P by the sequential coupling of three
isopentenyl 2P (IPP) molecules to dimethylallyl 2P
(DMAPP). Geranylgeranyl 2P serves as a precursor for
carotenoid, tocochromanol, gibberellin, and chlorophyll
biosynthesis (Aharoni et al., 2005; Joyard et al., 2009).
Hence, it is usually assumed that GPPS is not
involved in tocochromanol production. Although tomato plants
silenced for GPPS displayed a dwarfed phenotype and
reduced gibberellin levels, their carotenoid and
chlorophyll contents were not altered (Van Schie et al., 2007).
These results thus indicated that pigments originate from
a geranyl 2P-independent geranylgeranyl 2P pool and
suggested that GPPS might not influence VTE biosynthesis.
However, further functional experiments that validate this
hypothesis are clearly needed. Secondly, HGGT, which condenses
homogentisate with geranylgeranyl 2P during tocotrienol
synthesis, was identified in grass species in which tocotrienols
are the most abundant tocochromanol forms (Cahoon
et al., 2003). HGGT was identified neither in Arabidopsis
nor in tomato. However, tocotrienol traces have been
detected in both tomato and tobacco (Chun et al., 2006).
This might be explained by promiscuous substrate
acceptance by other prenyl transferases such as
homogentisate solanesyl transferase (HST) and/or
homogentisate phytyl transferase (VTE2). Flux changes
through the SK, MEP, and/or tocopherol pathways could
conceivably shift the substrate preference of both these
enzymes (Herbers, 2003; Falk and Munné-Bosch,
2010). This hypothesis is supported by the fact that under
certain conditions tobacco plants, which do not have
a canonical HGGT, exhibit higher levels of tocotrienols
when they co-express the Arabidopsis HPPD and the yeast
prephenate dehydrogenase (Rippert et al., 2004). Therefore, the
enhanced supply of homogentisate may affect the substrate
specificity of prenyl transferases, diverting flux into tocotrienol
synthesis (Falk and Munné-Bosch, 2010). In addition,
A. thaliana HST, which is essential for plastoquinone-9
biosynthesis by catalysing the condensation of solanesyl 2P
(SDP) and HGA, also accepts farnesyl 2P and geranylgeranyl
2P as prenyl donors (Sadre et al., 2006; Tian et al.,
2007), providing further evidence in support of tocotrienol
production in the absence of a specific HGGT.

Table 3. Parameter estimates and tests of selection.

          M0                    M1a                              M2a
Gene      lnL (a)    ω (b)      lnL (a)    ω0 (c)     p0 (d,e)   lnL (a)    ω0 (f)     p0 (g)     ω2 (h)     p2 (i,j)

ggps(1)   -2640.1    0.041      -2598.9    0.012      0.734**    -2598.9    0.012      0.743      -          0.000
ggps(2)   -2594.4    0.096      -2551.5    0.028      0.730**    -2549.8    0.032      0.746      2.474      0.254*
ggps(3)   -2444.5    0.014      -2411.5    0.018      0.812**    -2411.5    0.018      0.813      1.134      0.187
ggps(4)   -2233.7    0.061      -2200.8    0.019      0.811**    -2196.7    0.019      0.811      999.000    0.008**
tat(1)    -2718.5    0.101      -2690.3    0.044      0.792**    -2690.3    0.044      0.792      -          0.000
tat(2)    -2857.8    0.077      -2841.7    0.046      0.817**    -2841.7    0.046      0.817      -          0.000
tyra(1)   -2487.3    0.002      -2461.0    0.024      0.740**    -2461.0    0.024      0.740      -          0.000
tyra(2)   -2657.4    0.108      -2615.1    0.024      0.612**    -2615.1    0.024      0.612      -          0.000
vte4      -2425.8    0.062      -2391.8    0.028      0.765**    -2391.8    0.028      0.765      -          0.000
vte1      -3430.5    0.072      -3373.4    0.028      0.730**    -3371.9    0.039      0.742      2.165      0.257*
vte3(1)   -2138.5    0.051      -2124.3    0.031      0.900**    -2124.3    0.031      0.900      -          0.000
vte3(2)   -2256.5    0.057      -2231.5    0.022      0.842**    -2231.5    0.022      0.842      -          0.000
vte2      -2693.7    0.096      -2653.3    0.030      0.739**    -2653.3    0.030      0.739      -          0.000
vte5      -2302.1    0.079      -2287.6    0.023      0.541**    -2287.6    0.023      0.541      -          0.000
hppd(1)   -2586.5    0.064      -2552.8    0.024      0.818**    -2552.8    0.024      0.820      2.258      0.016
hppd(2)   -2617.9    0.037      -2579.5    0.020      0.805**    -2579.5    0.020      0.805      -          0.000
apt       -2447.7    0.099      -2400.6    0.024      0.770**    -2399.8    0.032      0.799      2.629      0.201
prai      -1556.9    0.144      -1551.3    0.104      0.732**    -1551.9    0.104      0.732      -          0.000
fpgs      -3435.8    0.100      -3410.5    0.055      0.742**    -3410.5    0.055      0.742      -          0.000
chl       -2198.3    0.127      -2186.0    0.078      0.846**    -2185.9    0.080      0.853      1.124      0.146
sec14     -2547.0    0.030      -2527.3    0.025      0.894**    -2527.2    0.025      0.894      -          0.000
lycb      -2628.0    0.084      -2608.2    0.046      0.772**    -2608.2    0.046      0.772      -          0.000

(a) Log likelihood of model.
(b) Parameter estimate assuming a single dN/dS ratio per gene.
(c) Estimated dN/dS for proportion of codons (p0) under purifying selection; the rest of codons assumed to be evolving neutrally.
(d) Estimated proportion of codons under purifying selection.
(e) Test of M1a versus M0.
(f) Estimated dN/dS for proportion of codons (p0) under purifying selection.
(g) Estimated proportion of codons under purifying selection.
(h) Estimated dN/dS for proportion of codons (p2) under positive selection.
(i) Estimated proportion of codons under positive selection.
(j) Test of M2a versus M1a.
-, not available.
* χ2 test using P<0.10; ** χ2 test using P<0.01.

Prephenate aminotransferase (PAT, EC 2.6.1.57), which
converts the prephenate intermediate into arogenate, has been
characterized in bacteria. Although its activity had also
been detected in plants, no loci had been associated
with this enzymatic function (Tzin et al., 2009). Recently,
two independent reports provided the biochemical and
functional characterization of this plant enzyme, thus
completing the identification of the genes involved in phenylalanine
and tyrosine biosynthesis (Graindorge et al., 2010; Maeda
et al., 2010).

One missing link still remains within the metabolic
network under study: the absence of a phytol-P
kinase that could provide phytyl 2P as an alternative to the
MEP pathway (Ischebeck et al., 2006; Valentin et al., 2006).

All the MEP enzyme-encoding genes surveyed here
are in accordance with sequences previously reported
for tomato (Lois et al., 2000; Rohdich et al., 2000;
Rodríguez-Concepción et al., 2001, 2003; Botella-Pavía
et al., 2004; Ament et al., 2006; Paetzold et al., 2010; and
GenBank database direct submissions for IPI, GQ169536
and EU253957), with the exception of the 2-C-methyl-D-erythritol
4-phosphate cytidylyltransferase (CMS)- and
geranylgeranyl reductase (GGDR)-encoding genes, which
have not been reported before in this species. Moreover,
two additional loci for GGPS were identified here. For
the SK pathway, enzyme-encoding genes involved in
reactions upstream of the prephenate intermediate have
already been reported in tomato (Gasser et al., 1988;
Schmid et al., 1992; Görlach et al., 1993, 1995; Eberhard
et al., 1996; Bischoff et al., 1996, 2001). Moreover, novel
loci for SDH/DHQ and CM were identified, whilst the
loci encoding PAT, TyrA, and TAT are described here
for the first time in tomato.

Protein localization data are highly valuable for
elucidating gene function. The MEP and SK
pathways are well accepted to operate in the chloroplast
(Rippert et al., 2009). Even so, the recent identification of
some cytosolic isoforms points to the existence of an
extraplastidial SK pathway (Ding et al., 2007). The results
presented here for the in silico prediction of the subcellular
localization of the deduced tomato protein sequences
indicated that the reactions of VTE metabolism occur mostly
in chloroplasts. In contrast, for the bi-functional SDH/DHQ
no signal peptide was detected, most probably due to
the failure to detect true targeting, since reported analysis of
subcellular fractions indicated that tomato SDH/DHQ is
localized in the chloroplast (Bischoff et al., 2001). However,
Ding et al. (2007) functionally characterized a cytosolic
SDH/DHQ isoform in tobacco and identified the tomato
orthologue (BF096277/SGN-U578253). This locus was not
included in this study due to its low identity to the Arabidopsis
protein according to the pipeline criteria adopted and the
absence of functional support so far. Unsuccessful subcellular
prediction is not an unusual scenario, since signal peptides
are occasionally misidentified in silico. For example,
these predictions indicated mitochondrial targeting for the
A. thaliana GPPS (data not shown), whilst experimental
data revealed that this enzyme is directed to the plastid
(Bouvier et al., 2000). It is important to point out the
existence of some ambiguous reports on actual GPPS
localization, where experimental evidence supports its
localization either in the plastid or in the cytosol depending
on the species (Nagegowda, 2010). Moreover, for the TAT and
CM enzymes, the plastidial localization could not be
confirmed by in silico analysis, in agreement with the
undetermined subcellular occurrence of the post-chorismate
portion of the SK pathway.

Following the identification of the tomato genes encoding
all enzymes for the tocochromanol synthesis pathway, the 41
loci were further mapped. Furthermore, a detailed profile of
all tocopherol isoforms was obtained for fruits from
ILs harbouring introgressed regions spanning the VTE
biosynthesis core pathway genes (ILs 7-4, 7-4-1, 8-2, 8-2-1, 9-1,
and 9-2-6) along with those on chromosome 6 (ILs 6-1 and
6-2) for which α-tocopherol QTL have been previously reported
(Schauer et al., 2006). Although VTE content has been
demonstrated to be a highly environmentally affected trait
(Schauer et al., 2008), the results reported here indicate that at
least those QTL mapped on chromosomes 6 and 9 show
a relatively high heritability level, as they are in accordance
with previous experiments reported by Schauer et al. (2006).
Moreover, it is worth noting that, as opposed to the previous
study, which was carried out on field-grown plants, the data
reported here were obtained from greenhouse plants.

Integrated analysis at metabolic, genomic, and genetic
levels allowed us to propose 16 candidate loci putatively
affecting tocopherol content in tomato: prai, fpgs, chl, apt,
and lycb on chromosome 6; tyra(1), vte2, hppd(1), and
tat(2) located on chromosome 7; vte1 and vte4 on
chromosome 8; and ggps(4), tyra(2), vte5, sec14, and
vte3(1) on chromosome 9. In plants, several QTL controlling
tocopherol content have been identified in soybean,
maize, oilseed rape, and Arabidopsis. As revealed in this
report, only some of those QTL localized to areas of the
genome where tocopherol core pathway genes occur
(Marwede et al., 2005; Gilliland et al., 2006; Chander et al.,
2008; Met al, 2010).



Detailed analysis of the identified QTL together with the
co-localizing genes raised interesting features concerning
VTE content regulation. None of the chromosome 6 QTL
co-localize with any of the MEP, SK, or tocopherol core
pathway genes, thus suggesting that the tocopherol content
variation observed in ILs 6-1 and 6-2 might be determined
by the effect of genes belonging to VTE-related pathways.
IL 6-1 showed elevated δ- and total tocopherol content in
comparison with the S. lycopersicum (M82) control. This line
harbours the S. pennellii alleles of the prai, fpgs, and chl
genes. The first two could be regulating hydroxyphenylpyruvate
fluxes into the tocopherol pathway by the deviation of
chorismate from the tryptophan and folate biosynthetic routes.
At another point of the pathway shown in Fig. 1, the
hypothesis that chlorophyll degradation-derived phytol
serves as an important intermediate for tocopherol synthesis
has been demonstrated by characterization of the Arabidopsis
vte5 mutant (Valentin et al., 2006). In this sense, the
presence of the S. pennellii chl allele in IL 6-1 might be raising
the phytyl 2P input into tocopherol synthesis. On the same
chromosome, IL 6-2 displayed significantly higher levels of
β-tocopherol than the control. This IL carries S. pennellii
alleles of chl, apt, and lycb. Even though these genes are
linked to intermediate metabolites of the tocopherol pathway
(chorismate and geranylgeranyl 2P), no evident links
can be specifically associated with the β-isoform. The
candidature of genes not directly involved in the VTE
structural pathway is supported by the regulatory network
acting on branching points. The CM acting on the SK
pathway is allosterically feedback-inhibited in plants by
phenylalanine and tyrosine and induced by tryptophan
(Tzin and Galili, 2010). Allelic variation in prai and apt
could be modifying tryptophan synthesis, altering influx
through the SK pathway and then increasing the homogentisate
precursor, finally resulting in the tocopherol content
variation observed in ILs 6-1 and 6-2.

Two QTL have been detected on chromosome 7; while IL
7-4 displayed lower levels of α-tocopherol, IL 7-4-1 showed
reduced amounts of the β-isoform. The introgressed wild
genome fragments in these ILs harbour S. pennellii alleles of
tyra(1), vte2, and hppd(1). IL 7-4 also spans the wild allele
of tat(2). These four candidates can alter total precursor
influx to tocopherol biosynthesis. Nevertheless, the way that
these genes could differentially modify the amounts of
tocopherol isoforms is currently unclear.

On chromosome 8, IL 8-2 displayed a significantly
lower α/γ-tocopherol ratio when compared with the control
(Fig. 3). Interestingly, this IL bears S. pennellii alleles of the
vte1 and vte4 genes, whose protein product activities
synthesize these two tocopherol isoforms. Therefore, the
low α/γ-tocopherol ratio could be caused by lower VTE1
and/or higher VTE4 activity of wild alleles. Intriguingly, IL
8-2-1 also harbours the wild alleles of vte1 and vte4. Even
though no significant alteration in α/γ-tocopherol ratio was
observed, a significant increase in total tocopherol was
detected in the fruits.

IL 9-1 exhibits increased levels of α-, β-, and total
tocopherol, most probably due to differential activity levels
of the enzymes encoded by the S. pennellii alleles of the
tyra(2) and ggps(4) loci that could lead to higher input of
hydroxyphenylpyruvate and phytyl 2P to the tocopherol
core pathway. Elevated α/γ and β/γ ratios support this
hypothesis (Fig. 3). It is interesting to note that
tyra(2) shows one of the highest dN values, indicating
protein divergence between S. pennellii and S. lycopersicum
(Fig. 4). In this sense, the results presented here reinforce
the hypothesis of Rippert et al. (2004) that hydroxyphenylpyruvate
is a key step in the accumulation of VTE in plants.
IL 9-2-6 exhibits increased levels of α- and total tocopherol
and harbours the S. pennellii allele of the vte5 gene, which
could improve the input of phytyl 2P via the phytol
alternative pathway (Valentin et al., 2006). However, levels
of β-tocopherol are not increased in IL 9-2-6, which could
be explained by the more efficient activity of the S. pennellii
vte3 allele. This is also reflected in the significant differences
found in the α/β ratio (Fig. 3).

Although there is a reliable link between the identified
candidate genes and fruit tocopherol content, the effect of
unidentified loci within the QTL cannot be ruled out:
due to the considerable size of the S. pennellii fragments in
the ILs, the differences observed in tocopherol accumulation
could also be caused by undetected genes located
within them.

Candidate gene approach has proved to be extremely 
powerful for studying the genetic architecture of complex 
traits (Zhu and Zao, 2007). By revealing the pattern of 
molecular genetic variation, the evolutionary analyses offer 
complementary data for strengthening gene candidature 
(Moyle and Muir, 2010). In this sense, the pattern of 
selection of different genes within a metabolic pathway 
allows the determination of whether they are subject to 
equivalent evolutionary forces underlying trait phenotypic 
variation. Comparing S. lycopersicum and S. pennellii 
alleles, out of the 22 genes studied, those encoding MEP, 
post-chorismate SK, and VTE core pathway enzymes 
presented d N /d s ratios below the mean, excluding those with 
more than one loci. In contrast, out of the six analysed 
candidate genes of related pathways, apt, chl, and lycb 
displayed d^/d s ratios values above the mean (Fig. 4). This 
constraint relaxation cannot be related to a higher region- 
specific mutation rate on chromosome 6 because two other 
genes mapped also on chromosome 6, prai and fpgs, 
presented low d N ld s ratios. Therefore, these results suggest 
a strong purifying selection for tocopherol central bio- 
synthesis genes while there is a more relaxed constraint for 
those genes of related pathways. When the selection 
constraint is evaluated codon by codon applying a likeli- 
hood ratio test, new insights about the evolutionary history 
of the genes are revealed. Nineteen genes exhibit purifying 
selection associated with neutrally evolving codons (Table 3) 
in agreement with the dN/dS analysis and previous reports 
that concluded that significant heterogeneity of evolutionary 
rates in metabolic pathway genes is mainly ascribed to 
differential constraint relaxation rather than to positive 
selection (Livingstone and Anderson, 2009; Yang et al., 
2009). Even so, three loci exhibited patterns consistent with 
codons evolving under positive selection. In tomato, loci showing 
positive selection have been identified associated with biotic 
and abiotic stresses (Jimenez-Gomez and Maloof, 2009). It 
would be unexpected to find signs of positive selection 
in loci of major biosynthetic pathways that feed multiple 
metabolic routes such as MEP and SK. However, the signs 
of diversifying selection found for ggps(2) and ggps(4) 
could be explained by the existence of two other paralogues 
evolving under a more conservative evolutionary pattern. In 
the case of vte1, the reduction in negative selective 
pressure on the protein might indicate that this is not a committed step in 
tocopherol production. 
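
The codon-by-codon evaluation mentioned above rests on a likelihood ratio test between nested codon models. The sketch below illustrates only the arithmetic of such a test. It assumes that the two models differ by two free parameters (as is the case for commonly used site-model comparisons), which is not stated in this section, and the log-likelihood values are invented placeholders rather than results from this study. The function name lrt_two_df is hypothetical.

```python
import math


def lrt_two_df(lnl_null: float, lnl_alt: float) -> tuple[float, float]:
    """Likelihood ratio test for nested models differing by 2 parameters.

    Returns (test statistic, p-value). The 2-degree-of-freedom setting is
    an assumption made for illustration, not taken from the source.
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    # For nested models the statistic cannot be negative in theory;
    # clamp small negative values caused by optimisation noise.
    stat = max(stat, 0.0)
    # Survival function of the chi-square distribution, exact for df = 2.
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Placeholder log-likelihoods (NOT values from this study).
stat, p = lrt_two_df(lnl_null=-1234.5, lnl_alt=-1229.5)
print(f"2*deltaLnL = {stat:.2f}, p = {p:.4f}")
```

A small p-value would favour the model that allows a class of codons under positive selection; a large one would leave the simpler purifying/neutral model as the adequate description.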

Studies of evolutionary rates of genes in the plant 
anthocyanin (Lu and Rausher, 2003) and carotenoid 
(Livingstone and Anderson, 2009) pathways have demon- 
strated that upstream genes in the pathway evolved more 
slowly than downstream genes. However, this seems not to 
be a constant trend. Downstream genes in the gibberellin 
pathway did not exhibit elevated substitution rates and 
instead, genes encoding either the branch point enzyme or 
those catalysing multiple steps in the pathway showed the 
lowest evolutionary rates due to strong purifying selection 
(Yang et al., 2009). This observation is in close agreement 
with the theory of pathway fluxes, which indicates that 
natural selection would target enzymes controlling meta- 
bolic fluxes between converging pathways. Consequently, 
these branch points are usually targets of selection, experi- 
encing higher evolutionary constraints (Flowers et al., 
2007). In this sense, the lowest value of dN/dS was observed 
for the tyra(1) gene whose protein product shares its 
substrate with phenylalanine/tyrosine biosynthesis resulting 
in branching points, whilst the tyra(2) paralogue displays 
a relaxed evolutionary constraint, indicative of a functional 
divergence. 
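
The comparison of per-gene dN/dS ratios against the mean, used throughout this discussion, can be sketched as follows. This is a minimal illustration only: the gene labels and ratio values are invented placeholders, not data from Fig. 4, and the function name split_by_mean is hypothetical.

```python
from statistics import mean


def split_by_mean(ratios: dict[str, float]) -> tuple[float, list[str], list[str]]:
    """Split genes into those below and those at or above the mean dN/dS."""
    avg = mean(ratios.values())
    below = sorted(g for g, r in ratios.items() if r < avg)
    above = sorted(g for g, r in ratios.items() if r >= avg)
    return avg, below, above


# Placeholder values (NOT measurements from this study).
example = {"gene_a": 0.05, "gene_b": 0.10, "gene_c": 0.45}
avg, below, above = split_by_mean(example)
print(f"mean dN/dS = {avg:.2f}")
print("below mean (stronger constraint):", below)
print("at or above mean (relaxed constraint):", above)
```

Genes falling below the mean would be read as being under stronger purifying selection, and those above it as being under relaxed constraint, in line with the interpretation given in the text.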

Genes responsible for adaptive morphological and phys- 
iological differences between species carry signatures of 
positive selection (Aguileta et al., 2010). In this sense, 
regarding the variation in VTE content observed in the 
S. pennellii introgressed lines in comparison with that of 
S. lycopersicum, the presence of codons evolving neutrally 
and/or under positive selection could result in novel protein features that are 
a source of new functional profiles. Even when coding 
sequences are relevant to phenotype, they might not be the 
location at which key evolutionary changes occur. Analyses 
across coding sequences do not reveal allelic differences in 
regulatory sequences that could also be determining the 
observed phenotypic variations. In fact, a co-response 
analysis of 32 of the genes identified here revealed an 
intricate network suggesting these pathways to be finely 
regulated at the level of gene expression (data not shown). 

This report describes a comprehensive survey of the genes 
encoding VTE biosynthesis pathway enzymes in tomato, 
and the methods adopted allowed the identification of novel 
tocopherol QTL. By an integrated analysis of the genome 
sequence data together with a well-characterized biosynthetic 
pathway, such as that for VTE in the model species 
Arabidopsis, this genetic/genomic approach identified loci and 
allelic variations that probably impact antioxidant content 
in tomato fruit. The identified candidate genes support 
cross-talk between the MEP, SK, and tocopherol core 
pathways through the control of VTE accumulation in 
tomato fruit. In addition, the VTE-related pathway genes 
might contribute to regulation of the supply of intermedi- 
ates for plastid tocopherol biosynthesis. The data produced 
provide a platform for functional studies that will contrib- 
ute to elucidation of the biosynthesis and catabolism of 
tocochromanols, and their role in plant physiology. 

Supplementary data 

Supplementary data are available at JXB online. 
Supplementary Table S1 lists the gene primers used. 